272 research outputs found
Focused information criterion and model averaging for generalized additive partial linear models
We study model selection and model averaging in generalized additive partial
linear models (GAPLMs). Polynomial splines are used to approximate the nonparametric
functions. The corresponding estimators of the linear parameters are shown to
be asymptotically normal. We then develop a focused information criterion (FIC)
and a frequentist model average (FMA) estimator on the basis of the
quasi-likelihood principle and examine theoretical properties of the FIC and
FMA. The major advantages of the proposed procedures over the existing ones are
their computational expediency and theoretical reliability. Simulation
experiments have provided evidence of the superiority of the proposed
procedures. The approach is further applied to a real-world data example.
Comment: Published at http://dx.doi.org/10.1214/10-AOS832 in the Annals of
Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical
Statistics (http://www.imstat.org).
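The model-averaging idea in the abstract above can be illustrated with a minimal sketch. One common way to turn per-model criterion scores into averaging weights is exponential smoothing of the scores (smoothed-FIC-style weights); the FIC values and focus-parameter estimates below are hypothetical placeholders, not outputs of the paper's quasi-likelihood procedure.

```python
import math

def fma_weights(fic_scores, kappa=1.0):
    """Smoothed weights w_m proportional to exp(-kappa * FIC_m / 2),
    normalized to sum to one. Smaller FIC => larger weight."""
    raw = [math.exp(-kappa * s / 2.0) for s in fic_scores]
    total = sum(raw)
    return [r / total for r in raw]

# Hypothetical FIC values for three candidate submodels
fic = [4.1, 2.3, 7.8]
weights = fma_weights(fic)

# Model-averaged estimate of a focus parameter mu, combining
# hypothetical per-model estimates with the weights above
mu_hat = [0.52, 0.48, 0.61]
mu_avg = sum(w * m for w, m in zip(weights, mu_hat))
```

The submodel with the smallest FIC receives the largest weight, but the other candidates still contribute, which is the point of averaging over selecting a single model.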
Sentence-Level Content Planning and Style Specification for Neural Text Generation
Building effective text generation systems requires three critical
components: content selection, text planning, and surface realization, which
are traditionally tackled as separate problems. Recent all-in-one style
neural generation models have made impressive progress, yet they often produce
outputs that are incoherent and unfaithful to the input. To address these
issues, we present an end-to-end trained two-step generation model, where a
sentence-level content planner first decides on the keyphrases to cover as well
as a desired language style, followed by a surface realization decoder that
generates relevant and coherent text. For experiments, we consider three tasks
from domains with diverse topics and varying language styles: persuasive
argument construction from Reddit, paragraph generation for normal and simple
versions of Wikipedia, and abstract generation for scientific articles.
Automatic evaluation shows that our system can significantly outperform
competitive comparisons. Human judges further rate our system-generated text as
more fluent and correct than text generated by variants that do not consider
language style.
Comment: Accepted as a long paper to EMNLP 201
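The two-step pipeline described in this abstract can be sketched as a toy program: a planner assigns keyphrases and a style label to each sentence slot, and a realizer renders each slot into text. Everything here (the keyphrases, style labels, and template-based realizer) is an invented stand-in for the paper's trained planner and decoder, shown only to make the division of labor concrete.

```python
def plan(keyphrases, style):
    """Toy sentence-level planner: one plan step per keyphrase,
    each tagged with the desired language style."""
    return [{"style": style, "keyphrases": [kp]} for kp in keyphrases]

def realize(step):
    """Toy surface realizer: render one sentence from a plan step.
    A real system would use a trained neural decoder here."""
    body = " and ".join(step["keyphrases"])
    if step["style"] == "simple":
        return f"This is about {body}."
    return f"The evidence concerning {body} supports the claim."

steps = plan(["renewable energy", "carbon pricing"], "simple")
text = " ".join(realize(s) for s in steps)
```

Separating planning from realization lets the style decision ("simple" vs. a more formal register) change the surface form without altering which content is covered.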
- …